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Abstract 

Artificial Intelligence (AI) achieved super-human per¬ 
formance in a broad variety of domains. We say that 
an AI is made Artificially Stupid on a task when some 
limitations are deliberately introduced to match a hu¬ 
man’s ability to do the task. An Artificial General In¬ 
telligence (AGI) can be made safer by limiting its com¬ 
puting power and memory, or by introducing Artificial 
Stupidity on certain tasks. We survey human intellec¬ 
tual limits and give recommendations for which limits 
to implement in order to build a safe AGI. 

Introduction 

The Turing Test |T| was designed to replace the philosophi¬ 
cal question "Can machines think ?" with a more pragmatic 
one: "Can digital computers imitate human behaviors in an¬ 
swering text questions ?" Since 1990, the Loebner Prize 
competition awards the AI that has the more human-like be¬ 
haviour when passing a five-minutes Turing Test. This con¬ 
test is controversial within the field of AI for encouraging 
low-quality interactions and simplistic chatbots 0- 

Programmers force chatbots to make mistakes, such as 
typing errors, to be more human-like. Because computers 
achieve super-human performance in some tasks, such as 
arithmetic JT| or video game ||3j, their ability may need to be 
artificially constrained. Those deliberate mistakes are called 
Artificial Stupidity in the media 0 0- 

Solving the Turing Test problem in its general form im¬ 
plies building an AI capable of delivering human-like replies 
to any question of human interest. As such, it is known to be 
Al-Complete [6], because by solving this problem we would 
be able to solve most of problems of interest in Artificial In¬ 
telligence by reformulating them as questions during a Tur¬ 
ing Test. To appear human, an AI will need to fully under¬ 
stand human limits and biases. Thus, the Turing Test can be 
used to test if an AI is capable of understanding human stu¬ 
pidity. 

By deliberately limiting an AI’s ability to achieve a task, 
to better match humans’ ability, an AI can be made safer, 
in the sense that its capabilities will not exceed by sev¬ 
eral orders of magnitude humans’ abilities. Upper-bounds 
on the number of operations made per second by a human 
brain have been estimated 0 0 - To obtain an AI that does 


not exceed by far humans’ abilities, for instance in arith¬ 
metics, the computing power allowed for mathematical ca¬ 
pabilities must be artificially diminished. Besides, humans 
exhibit cognitive biases, which result in systematic errors in 
judgment and decision making [j9]. In order to build a safe 
AGI, some of those biases may need to be replicated. 

We will start by introducing the concept of Artificial Stu¬ 
pidity. Then, we will recommend limitations to build a safer 
AGI. 

Artificial Stupidity 
Passing the Turing Test 

To pass the Turing Test, programs that are deliberately sim¬ 
plistic perform better at Turing Test contests such as the 
Loebner Prize. The computer program A.L.I.C.E. (or Artifi¬ 
cial Linguistic Internet Computer Entity) | FOj won the Loeb¬ 
ner Prize in 2000, 2001 and 2004, even though "there is no 
representation of knowledge, no common-sense reasoning, 
no inference engine to mimic human thought. Just a very 
long list of canned answers, from which it picks the best 
option" cni- A.L.I.C.E. has an approach similar to ELIZA 
© it identifies some relevant keywords and give appropri¬ 
ate answers without learning anything about the interrogator 

HD- 

A general trend for computer programs written for the 
Loebner prize is to avoid being asked questions it cannot 
answer, by directing the conversation towards simpler con¬ 
versational context. For A.L.I.C.E. and ELIZA, that means 
focusing mainly of what had been said in the last few sen¬ 
tences (stateless context). Another example of an AI per¬ 
forming well at Turing Test contests is Eugene Goostman, 
who convinced 33% of the judges that it was human G3- 
Goostman is portrayed as a thirteen-year-old Ukrainian boy 
who does not speak English well. Thus, Goostman makes 
typing mistakes and its interrogators are more inclined to 
forgive its grammatical errors or lack of general knowledge. 
Introducing deliberate mistakes, what we call Artificial Stu¬ 
pidity , is necessary to cover up an even greater gap in intel¬ 
ligence during a Turing Test. 

Interacting with Humans 

Outside those very specific Turing Test contests. Artificial 
Stupidity is being increasingly introduced to interact with 



humans. In “Artificial Stupidity: The Art of Intentional Mis¬ 
takes” |3j, Liden describes the design choices an AI pro¬ 
grammer must make in the context of video games. He gives 
general principles that a Non-Player Character (NPC) must 
follow to make the game playable. For instance, NPCs must 
"move before firing" (e.g. by rolling when the player enters 
the room) so that the player has additional time to under¬ 
stand that a fight is happening, or "miss the first time" to in¬ 
dicate the direction of the attack without hurting the player. 
In video games, because computer programs can be much 
more capable than human beings (for instance because of 
their perfect aim in First Person Shooters (FPS)), develop¬ 
ers force NPCs to make mistakes to make life easier for the 
human player. 

This tendency to make Al deliberately stupid can be ob¬ 
served across multiple domains. For example, at Google 
I/O 2018, Sundar Pichai introduced Google Duplex, "a new 
technology for conducting natural conversations to carry out 
“real world” tasks over the phone" ]T4|. In the demo, the 
Google Assistant used this Google Duplex technology to 
successfully make an appointment with a human. To that 
end, it used the interjection "uh" to imitate the space filler 
humans use in day-to-day conversations. This interjection 
was not necessary to make the appointment. More precisely, 
when humans use these kind of space fillers, it is a sign of 
poor communication skills. The developers introduced this 
type of Artificial Stupidity to make the call more fluid, more 
human-like. 

Exploiting Human Vulnerabilities 

When interacting with AI, humans want to fulfill some basic 
needs and desires. The AI can exploit these cravings, by giv¬ 
ing people what they want. Accordingly, the AI may want 
to appear vulnerable to make the human feel better about 
himself. For instance, Liden |3| suggests to give NPCs "hor¬ 
rible aim", so that humans feel superior because they think 
they are dodging gunshots. Similarly, he encourages "kung- 
fu style" fights, where amongst dozens of enemies, only two 
are effectively attacking the player at each moment. Thus, 
the player feels powerful (because he believes he is fighting 
multiple enemies). Instead of unleashing its full potentiality, 
the AI designers diminish its power to please the players. 

The same mechanism applies to the computer program 
A.L.I.C.E.. The program delivers a simple verbal behav¬ 
ior , and because "a great deal of our interactions with oth¬ 
ers involves verbal behavior, and many people are interested 
in what happens when you talk to someone "(11], it fulfills 
this elementary craving of obtaining a verbal behavioral re¬ 
sponse. In his "Artifical Stupidity" article, part 2, Sundman 
quotes Wallace, the inventor of A.L.I.C.E, explaining why 
he thinks humans enjoy talking to his program: "It’s merely 
a machine designed to formulate answers that will keep you 
talking. And this strategy works, [...] because that’s what 
people are: mindless robots who don’t listen to each other 
but merely regurgitate canned answers." 

More generally, virtual assistants and chatbots are tech¬ 
nologies aimed at helping consumers. To that end, they must 
provide value. They might achieve that by gaining users’ 
thrust and improving their overall well-being. For instance, 


Woebot, a Facebook chatbot developed by Stanford Re¬ 
searchers, improves users’ mental-health through Cognitive 
Behavorial Therapy (CBT) 1 15]. In addition to CBT, Woebot 
uses two therapeutic process-oriented features. First, it pro¬ 
vides empathic responses to the user, according to the mood 
the user said he had. Second, content tailored to its mood is 
presented to the user. ffb) . The chatbot delivers specifically 
what will help the user best, without describing the details 
of the user’s mental-health. It gives a very simple answer the 
consumer wanted, masking the intelligence of its algorithm. 
Chatbots are designed to help humans, giving appropriate 
simple responses. They are not designated to appear smarter 
than humans. 

Avoiding Superintelligence requires 
Superintelligence 

In "Computing Machinery and Intelligence" [T], Turing ex¬ 
poses common fallacies when arguing that a machine cannot 
pass the Turing Test. In particular, he explains why the belief 
that "the interrogator could distinguish the machine from the 
man simply by setting them a number of problems in arith¬ 
metic" because "the machine would be unmasked because of 
its deadly accuracy" is false. Indeed, the machine "would not 
attempt to give the right answers to the arithmetic problems. 
It would deliberately introduce mistakes in a manner calcu¬ 
lated to confuse the interrogator." Thus, the machine would 
hide its super-human ability by giving a wrong answer, or 
simply saying that he could not compute it. Similarly, in a 
video game, AI designers artificially make the AI not omni¬ 
scient, so that it does not miraculously guess where each and 
every weapon of the game is located [;3j. 

The general trend here is that AI tend to quickly achieve 
super-human level performance after having achieved 
human-level performance. For instance, for the game of Go, 
in a few months, the state-of-the-art went from strong am¬ 
ateur, to weak professional player, to super-human perfor¬ 
mance HU- From that point onwards, to make the AI pass a 
Turing Test, or make it behave human-like to satisfy human 
desires, AI designers must deliberately limit its capabilities. 

We now take the point of view of algorithmic complexity. 
We call Al-problems the problems that can be solved (i.e. for 
which the correct output for a given input can be computed) 
by the union of all humans (611. The Turing Test is said to be 
Al-complete @ , because all Al-problems can be reduced via 
a polynomial-time reduction to it (by framing the problem 
as a question during a Turing Test), and it is an Al-problem. 
Additionally, we say that a problem is AI-Hard if we can find 
a polynomial-time reduction from an Al-complete problem 
to this problem. In this setting, the problem of comprehen¬ 
sively specifying the limits of the human cognition to an AI 
is AI-Hard. Indeed, it implies being capable to know which 
questions a human can answer in a given time during a Tur¬ 
ing Test, and which answers are likely to be produced by 
humans. Therefore, the Turing Test can be reduced polyno- 
mially to specifying the limits of human cognition, so spec¬ 
ifying such limits is AI-Hard. 

Although the precise limits of human cognition are not 
fully known, specific recommendations on minima or max¬ 
ima for different capabilities can be given. 


The Cognitive Limits of the Human Brain 

I believe that in about fifty years time it will be possible 
to programme computers, with a storage capacity of 
about 10 9 , to make them play the imitation game so 
well that an average interrogator will not have more 
than 70 percent chance of making the right 
identification after five minutes of questioning. 

Turing, Computing Machinery and Intelligence, 1950 

Sixty-eight years ago, Turing estimated that in the 2000s, 
an AI with only 1 Gb in storage could pass a five minutes 
Turing Test 30% of the time. Previously, we showed that 
passing the Turing Test in general was Al-complete. The 
amount of computing resources necessary to pass the Tur¬ 
ing Test is then a relevant estimate for determining the com¬ 
puting power necessary to attain Human-Level Machine In¬ 
telligence (HLMI). In what follows, we try to estimate the 
computing power of the human brain. 

Limits in Computing 

The brain is a complex system with an architecture com¬ 
pletely different from the usual von Neumann computer ar¬ 
chitecture. However, estimates about the storage capacity of 
the brain and the number of operations per second were at¬ 
tempted. 

Long-term Memory Here is how Turing 0 justifies the 
10 9 bits of storage capacity: 

“Estimates of the storage capacity of the brain vary 
from 10 10 to 10 15 binary digits. I incline to the lower 
values and believe that only a very small fraction is 
used for the higher types of thinking. Most of it is prob¬ 
ably used for the retention of visual impressions.” 

The storage capacity of the brain is generally considered 
to be within the bounds given by Turing (resp. 10 10 and 
10 15 ). Although the encoding of information in our brains is 
different from the encoding in a computer, we observe many 
similarities dj. 

To estimate the storage capacity of the human brain, we 
first evaluate the number of synapses available in the brain. 
The human brain has about 100 billion neurons OH- Each 
neuron has about five thousand potential synapses, so this 
amounts to about 5.10 14 synapses [8j, so 5.10 14 potential 
datapoints. 

This shows that the brain could in theory encode be¬ 
tween 10 12 and 10 15 bits of information, assuming that each 
synapses stores one bit. However, such estimates are still 
approximate because neuroscientists do not know precisely 
how synapses actually encode information: some of them 
can encode multiple bits by transmitting different strengths, 
and individual synapses are not completely independent 



Processing Even though the brain can encode Terabits 
of information, humans are in practice very limited in the 
amount of information they can process. 

In his classical article m , Miller showed how our minds 
could only hold about 7 ± 2 concepts in our working mem¬ 
ory. More generally, three essential bottlenecks were shown 


to limit information processes in the brain: the Attentional 
Blink (AB) limits our ability to consciously perceive, the 
Visual Short-Term Memory (VSTM) our capacity to hold 
in mind, and the Psychological Refractory Period (RFP) our 
ability to act upon the visual world |22j. In particular, the 
brain takes up to 100ms to process complex images [23) . 

Moreover, the processing time seems to take longer when 
the choice to make takes as input a complex information. 
This is known as Hick’s Law (24) : the time it takes to make 
a choice is linearly related to the entropy of the possible al¬ 
ternatives. 

Computing One approach to evaluate the complexity of 
the processes happening in the brain is to estimate the max¬ 
imum number of operations per second. Thus, Moravec (7) 
estimates that to replicate all human’s function as a whole 
one would need about 100 MIPS (Millions of Instructions 
per Second), by comparing it to the computational needs for 
edge extraction in robotics. Using the same estimation for 
the number of synapses in the brain (mentioned in Memory), 
Bostrom [81 concludes that the brain uses at most about 10 17 
operations per second (for a survey of the different estimates 
of the computational capacity of the brain, see (Bostrom, 
2008) (25)). 

Clock Speed The brain does not operate with a central 
clock. That’s why the term "clock-speed" does not describe 
accurately processes happening in the brain. However, it if 
possible to compare the transmission of information in the 
brain and inside a computer. 

Processes emerge and dissolve in parallel in different 
parts of the brain at different frequency bands: theta(5-8Hz), 
alpha(9-12Hz), beta(14-28Hz) and gamma(40-80Hz). Com¬ 
paring computer and brain frequences, Bostrom notes that 
“biological neurons operate at a peak speed of about 200 
Hz, a full seven orders of magnitude slower than a modern 
microprocessor (~2 GHz)” [ 261. 

It is important to note that clock speed, alone, do not fully 
characterize the performances of a processor (27). Further¬ 
more, the processes happening in the brain use several orders 
of magnitude more parallelization than modern processors. 

Cognitive biases 

Humans, like other animals, see the world through the 

lens of evolved adaptation 

The Evolution of Cognitive Bias, 2005 (5) 

Natural selection have shaped human perception, with the 
constraint of limited computational power. Cognitive biases 
led humans to draw inferences or adopt beliefs without cor¬ 
responding empirical evidence. The fundamental work of 
Tversky and Kahneman [28) highlighted the existence of 
heuristics and systemic errors in judgment. However, those 
biases helped to solve adaptive problems, i.e. “problems that 
recurred across many generations during a species’ evolu¬ 
tionary history, and whose solution statistically promoted re¬ 
production in ancestral environments” (29). 




Limited rationality The economist Herbert A. Simon op¬ 
posed the classical view of the rational economic agent. 
He viewed humans as organisms with limited computational 
powers, and introduced the concept of bounded rationality 
to take those limits into account in decision-making. (30). 

Cosmides and Tobby also criticized the study of eco¬ 
nomic agents as following "rational" decision rules, with¬ 
out studying the "computational devices" inside. Natural se¬ 
lection’s invisible hand created the human mind, and eco¬ 
nomics is made of the interactions of those minds. Evolu¬ 
tion led humans to develop domain-specific functions, rather 
than general-purpose problem-solvers. The intelligence of 
humans comes from those specific "reasoning instincts" that 
make inferences “just as easy, effortless, and "natural" to hu¬ 
mans as spinning a web is to a spider or building a dam is to 
a beaver” (29). 


Heuristics The intractability of certain problems, the lim¬ 
ited computational power of human minds, and uncertainty, 
are the most common ways to explain cognitive biases, and 
in particular heuristics, which are “rules of thumb that are 
prone to breakdown in systematic ways” 0. Heuristics aim 
at reducing the cost of computing while delivering good- 
enough solutions. Processes are limited by brain ontogeny, 
i.e. the development of different parts of the brain, and com¬ 
plex algorithms take longer and require additional resources. 

One of the most famous bias resulting from mental short¬ 
cuts is the "Linda Problem": individuals consider the asser¬ 
tion "Linda is a bank teller and active in the feminist move¬ 
ment" more probable than "Linda is a bank teller". This er¬ 
ror is caused by the conjunction fallacy, i.e. believing that 
the conjunction of two events is more probable than a single 
event (31). Another notorious example is the Fundamental 
Attribution Error, attributing certain mental states to individ¬ 
uals because of their behavior, and not because of the logical 
implications of the context 1321 


Error management biases Error Management Theory 
(EMT) studies cognitive biases in the context of error man¬ 
agement. It distinguishes two error types ||9): 

• false positives (false belief) 

• false negatives (failing to adopt a true belief) 


Recommendations to Build a Safer AGI 

Humans have clear computational constraints (memory, pro¬ 
cessing, computing and clock speed) and have developed 
cognitive biases. An Artificial General Intelligence (AGI) is 
not a priori constrained by such computational and cogni¬ 
tive limits. 

Hence, if humans do not deliberately limit an AGI in its 
hardware and software, it could become a superintelligence, 
i.e. "any intellect that greatly exceeds the cognitive perfor¬ 
mance of humans in virtually all domains of interest" (26) , 
and humans could lose control over the AI. In this section, 
we discuss how to constrain an AGI to be less capable than 
an average person, or equally, while still exhibiting general 
intelligence. In order to achieve this, resources such as mem¬ 
ory, clock speed, or electricity must be restricted. 

However, intelligence is not just about computing. 
Bostrom distinguishes three forms of superintelligence: 
speed superintelligence (“can do all that a human intellect 
can do, but much faster”), collective superintelligence (“A 
system composed of a large number of smaller intellects 
such that the system’s overall performance across many very 
general domains vastly outstrips that of any current cogni¬ 
tive system.”) and quality superintelligence (“A system that 
is at least as fast as a human mind and vastly qualitatively 
smarter”) (26) . A hardware-limited AI could be human- 
level-intelligent in speed, but still qualitatively superintel- 
ligent. 

Hardware 

To begin with, we focus on how to avoid speed superintelli¬ 
gence by limiting the AI’s hardware. For instance, its max¬ 
imum number of operations per second can be bounded by 
the maximum number of operations a human does. Simi¬ 
larly, by limiting its RAM (or anything that can be used as a 
working memory), we limit its processing power to process 
information at the similar rate as humans. 

Focusing only on limiting the hardware is nonetheless in¬ 
sufficient. We assume that, in parallel, there exists other lim¬ 
itations (in software) that prevents the AI to become qualita¬ 
tively superintelligent, to upgrade its hardware by changing 
its own physical structure, or to just buy computing power 
online. 


One of the findings of EMT is that humans are biased to 
make the less costly error, even if it is the most frequent (34) . 
Biases of this kind include 0: 

• Protective biases (e.g. avoiding noninfectious person) 

• Bias in Interpersonal Perception (for instance sexual 
overperception for male and commitment skepticism for 
female) 

• Positive Illusions (estimates unrealistic likelihoods for 
positive events) 

Artifacts Humans might appear irrational in experiments 
because the tested abilities were not optimized by evolution. 
Those are called biases as artifacts [9). For instance, humans 
are better at statistical prediction if the inputs are presented 
in frequency form (35). 


Storage Capacity 

I should be surprised if more than 10 9 was required for 
satisfactory playing of the imitation game, at any rate 
against a blind man. [...] A storage capacity of 10' 
would be a very practicable possibility even by present 
techniques. 

Turing, Computing Machinery and Intelligence, 1950 

We estimated the storage capacity of the human brain to 
be at most 10 15 bits, using one bit per synapse. The cost 
of hard drives have reached $0.05/Gb in 2017 [ 361. Hence, 
the storage capacity of a human brain would cost at most 
approximately $50,000. This is a pessimistic estimate: the 
brain uses maybe orders of magnitude less information for 
storage, and the price of a Gb could decrease even lower in 
the future. 




To have a safe AGI, one should rather use much less stor¬ 
age capacity. For instance, as quoted in the epigraph, Turing 
[11 estimated 10' (or about 10Mb) to be a practical storage 
capacity to pass the Turing Test (and therefore attain AGI). 
Even if this seems very low, consider that an AGI could have 
a very elegant data structure and semantics, that may allow 
to store information much more concisely than our brains. In 
comparison, English Wikipedia in compressed text is about 
12Gb, and is growing at a steady rate of lGb/year [37]. For 
this reason, allowing more than 10 Gb of storage capacity is 
unsafe. With 10Gb of storage, it could have access to an of¬ 
fline version of Wikipedia permanently, and be qualitatively 
superintelligent in the sense that it would have direct access 
to all human knowledge. 

A counter-argument for such memory limit is that our 
brains process much more information than 10Mb when ob¬ 
serving the world, and would store all those images in our 
long-term memory. The human-eye could observe, at most, 
576Mb in a single glance (forgetting about all the visual 
flaws) J38) . However, all this resolution is not necessary to 
perform edge detection and image recognition. For instance, 
a 75x50 pixels image is enough to identify George W. Bush 
[39), and the popular database for handwritten digit recogni¬ 
tion MNIST use 28x28 images |40||. Thus, we can imagine 
a "visual processing unit", that would transform the photons 
received by the captor of the AGI into a low resolution im¬ 
age, precise enough to be interpreted by our AGI, but still 
orders of magnitude smaller in size than a Mb. 


Memory access Manuel Blum opposes the application of 
traditional complexity theory to formalize how humans pro¬ 
cess information and do mental computations, in particular 
to generate a password from a private key previously mem¬ 
orized ED- In his Human-Model, memory can be modeled 
as a two-tape Turing machine: one for long-term memory, 
one for short-term memory. Blum considers potentially in¬ 
finite tapes, because the size of the tape is not relevant for 
complexity theory, but for our purpose we can consider the 
tapes to be at most the size discussed previously for mem¬ 
ory (e.g. 10Mb). According to Miller’s magical number 7±2 
human working-memory works with a limited amount 
of chuncks. So our two-tape Turing Machine should have 
a very short "short-term memory" tape, containing at most 
two or three 64-bit pointers pointing to chunks in the long¬ 
term memory (the other tape). More specifically, storing in¬ 
formation in the long-term memory is slow, but reading from 
long-term memory (given the correct pointer) is fast. 

In modern computers, RAM’s bandwidth is about 
lOGBytes/s, hard drive’s bandwidth is lOOMBytes/s, and 
with high clock rate a CPU can process about 25GBytes/s 
m . In order to build a safer AGI, the memory access for the 
two mentioned tapes must be restricted, so that we are sure 
that data is being retrieved slower than by humans. However, 
the computing paradigms being very different, it is difficult 
to give a clear estimate of how much information is being 
retrieved per second by the brain 03 - 

Processing We previously mentioned how the human 
brain could only process a limited amount of information 
per second. In addition to a limited number of chunks in 


working memory, other features must also be implemented 
to slow down the AGI and make it human-level intelligent. 
For instance, introduce some artificial period to process in¬ 
formation, such as images, depending on the content. We 
already commented on the necessary duration of 100ms to 
process complex images |23 |. Similarly, the amount of time 
to process a certain image might depend on the complex¬ 
ity and size of the image. More generally, a model similar 
to Hick’s Law J24| can be implemented, to have the AGI 
take linearly more time to take decisions as the information- 
theoretic entropy of the decision increases. 

Clock Speed As we mentioned multiple times, the brain 
parallelizes much more, using a totally different computing 
paradigm than the von Neumann architecture. Therefore, us¬ 
ing a clock rate close to the frequency of the brain frequen¬ 
cies (typically - 10Hz) is not relevant to our purpose, and it 
might prove difficult to build an AGI that exhibits human- 
level intelligence in real time using such a low clock rate. 

To solve this, one possibility is to first measure better the 
trajectory of thoughts occuring in the brain, and then give a 
precise estimate of how frequently the processes in the brain 
are refreshed (i.e. evaluating some kind of clock rate). An¬ 
other solution is to abandon the von Neumann architecture, 
and build the AGI with a computer architecture more similar 
to the human brain. 

Computing In "The Cognitive Limits of the Human 
Brain", we mentioned Bostrom’s estimate [81 of at most 10 17 
operations per second for the brain. This is a very large num¬ 
ber, and could only happen if the AGI’s hardware allowed 
that much computing power. This will not be the case, ac¬ 
cording to what we said previously in Storage capacity and 
Memory access. 

More importantly, even if we could measure a number of 
operations per second that would actually be lower than any 
number of operations per second a human brain does for any 
given task, it might not be a correct bound. Why? The brain 
has evolved to achieve some very specific tasks, useful for 
evolution, but nothing guarantees that the complexity or the 
processes happening in the brain are algorithmically opti¬ 
mal. Thus, the AGI could possess a structure that would be 
far more optimized for computing than the human brain. 

Therefore, restricting the number of operations alone is 
insufficient: the algorithmic processes and the structure of 
the AGI must be precisely defined so it is clear that the re¬ 
sulting processes happening are performing tasks at a lower 
rate than humans. 

Software 

In November 2017, there were more than 45 companies in 
the world working on Artificial General Intelligence J42J. 
Ben Goertzel distinguishes three major approaches J43) : 

• 1) Use neural networks to emulate the different parts of 
the brains, visual and auditory processing, and connect all 
those parts together by emulating how those parts talk to 
each other. Deepmind is a representative example. 

• 2) Take Marcus Hutter’s Universal Artificial Intelligence 
model |44) and try to limit the required computing power 



• 3) Ben Goertzel’s approach with OpenCog: look at the 
cognitive processes happening in the brain from a high- 
level cognitive point of view, and map this into a weighted 
labeled hypergraph. 

In this paper, in order to build a safe AGI with at most 
human intelligence, we will focus on the first approach. In¬ 
deed, a more universal or high-level Artificial General Intel¬ 
ligence will have a very different computing paradigm than 
the human brain, so it would be difficult to restrain the AGI 
computing resources accordingly. 

In addition to those neural networks emulating processes 
happening in the brain, we consider additional safety mea¬ 
sures that must be implemented to obtain a safe AGI. 

No self-improvement A limited initial hardware is not a 
real restriction if the AGI can buy some additional comput¬ 
ing power online or change its own structure. To prevent the 
AGI from changing its own code, one possibility is to hard- 
code the feature directly "you shall not rewrite your own 
code". Another (subtler) possibility is to encrypt its source 
code, making self-modification more difficult. 

However, it might seem that this does not completely 
solve the problem, as the AGI could manipulate humans into 
changing its code. Yet, with our AGI design, the AGI would 
not be superintelligent, but at most human-intelligent. So the 
AGI would not have the necessary "social manipulation su¬ 
perpowers" p6| to convince humans to change its code. 

Cognitive Biases Humans have developed cognitive bi¬ 
ases because of natural selection. Incorporating human bi¬ 
ases into the AGI present several advantages: they can limit 
the AGI’s intelligence and make the AGI fundamentally 
safer by avoiding behaviors that might harm humans. Thus, 
heuristics limiting the possible results of a computation, or 
error management biases, can help build a less capable AGI 
that would also make less errors. 

Here is a list of cognitive biases that could make the AGI 
safer f45j: 

• Planning fallacy: would prohibit the AGI from success¬ 
fully planning a takeover or a treacherous turn 

• Bandwagon Effect: the AGI will acquire human values 
that are shared among the group it belongs to 

• Confirmation Bias : the AGI will rationalize and confirm 
that it is useful to help humans, or that AI Safety is an 
important problem 

• Conservatism : the AGI will keep the same initial values, 
and not become evil 

• Courtesy bias: the AGI will try to not offend anyone, 
avoiding aggressive behaviors 

• Functional Fixedness : the AGI will only use objects like 
humans do. It will not “hack” anything or use objects with 
malicious intent 

• Information bias : the AGI will have the tendency to seek 
information and think more, avoiding errors 

• Mere-exposure effect: the AGI will have good intention 
towards humans because it will be exposed to humans 


• Omission bias : bad actions will be much lower in its util¬ 
ity function than status quo 

• Status quo bias: an action that does not change the envi¬ 
ronment or even inaction will be desirable 

• Authority bias: some human values are shared among 
most authority figures. The AGI will avoid extreme in¬ 
dividual opinions and prefer conventional ones 

• System justification: the AGI will defend human values, 
and won’t promote revolution 

• Processing difficulty effect: the AGI will prefer well- 
thought arguments and well-argued ideas over simple ac¬ 
tions or decisions that can be harmful 

• Spotlight effect: the AGI will overestimate the amount of 
people observing its behavior. Hence, it will behave in a 
similar fashion both with high and low supervision 

It is clear that some of those biases can lead to irrational 
behavior. Therefore, even though those biases can lead to 
safer behavior, they must be implemented with caution. 

Avoiding a treacherous turn In the Chapter 8 of Super¬ 
intelligence J26| , Bostrom notes the following: behaving 
nicely in a sandbox environment is a convergent instrumen¬ 
tal goal, both for a friendly or a malicious AI. At one point, 
an AI could exhibit enough intelligence to understand that it 
is being observed and could restrain its own intelligence ac¬ 
cordingly. Bostrom calls this hi st moment of realization “the 
conception of deception”, and would be the last moment of 
vulnerability where we could observe a true malicious in¬ 
tent. After this point, the behaviors will appear "safe", until 
a certain threshold is reached, where the AI does not feel 
threatened anymore by its supervisors, and decides to take a 
treacherous turn and reveal its true nature by taking over. 

In the implementation of the AI, however, we can decide 
to make it very clear what processes are happening, when, 
and how much computing power it took to do each task. This 
interface would allow the human supervisors to better mon¬ 
itor the behavior of the AI in the sandbox environment, and 
detect a treacherous turn. Additionally, the designers can im¬ 
plement a safety feature that consists in preventing the AI 
from erasing any of the log files, or from even knowing that 
it is being observed. 

Conclusion 

In the history of Artificial Intelligence, one of the great¬ 
est challenge has been to pass the Turing Test. To win at 
the Imitation Game, chatbots are made Artificially Stupid. 
More generally, introducing Artificial Stupidity into an AI 
can improve its interaction with humans (for instance in 
video-game), but can also be a safety measure to better con¬ 
trol an AGI. In this paper, we proposed a design of a safer 
and humanly-manageable AGI. It would be hardware con¬ 
strained, so it would have less computing power than hu¬ 
mans, but also an architecture very similar to human brains. 
Additionally, some features in software might help avoid 
self-improvement, a treacherous turn, or just make the AI 
safer (e.g. cognitive biases). 


















This approach presents several limitations and makes 

multiple assumptions: 

• this paper is limited to the case where the first approach 
to AGI (emulating a human-brain with neural networks) 
is possible, and will be the first approach to succeed 

• prohibiting the AGI from hardware or software self- 
improvement might prove a very difficult problem to 
solve, and may even be incompatible with corrigibility 

• it may be impossible to build an AI that simultaneously 
operates under such heavy constraints and still exhibits 
general intelligence 

• this approach does not generalize: it is impossible to build 
a safe Superintelligence from this AGI design 

Therefore, progress must still be made to generalize this 

approach. Future directions of this research include: 

• exploring other limits of human cognition 

• determining how to apply those limits for different AGI 
designs (Marcus Flutter’s approach or Ben Goertzel’s ap¬ 
proach for instance) 
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